Abstract.

This is a collection of linguistic-mathematical approaches to Romanian rebus (crossword),
poetical, and juridical texts; it also proposes fancies, recreational math problems, and paradoxes. We
study the frequencies of letters, syllables, and vowels in various poems, grid definitions in
rebus, and rebus rules. We also compare scientific language, lyrical language, and
the language of puzzles, and compute the Shannon entropy and the Onicescu informational energy.

INTRODUCTION 

The aim of this section is to investigate some combinatorial aspects of
written language within the framework determined by the well-known game of
crossword puzzles. Various types of probabilistic regularities appearing in such puzzles
reveal some hidden, little-known restrictions operating in natural
languages. Most restrictions of this type are similar across natural languages. Our
direct concern will be the Romanian language.

Our research may have some relevance for the phono-statistics of Romanian. The
distribution of phonemes and letters is established for a corpus whose
morphological structure deviates from the standard language. Another aspect of our
research may be related to the so-called tabular reading in poetry. The horizontal-vertical
correlation considered in the first part of the paper offers some suggestions
concerning a two-dimensional investigation of the poetic sign.

Our investigation is concerned with the Romanian crossword puzzles published in
[4]. Various concepts concerning crossword puzzles are borrowed from N. Andrei [3].
Mathematical linguistic concepts are borrowed from S. Marcus [1], and S. Marcus, E.
Nicolau, S. Stati [2].



SECTION I. THE GRID

§1. MATHEMATICAL RESEARCHES ON GRIDS 

It is known that a word in a grid is bounded on the left and on the right either by a
black point or by the border of the grid.

We will take into account the words consisting of one letter (though they are not
clued in the Rebus), those of two letters (even if they have no meaning, e.g. NT, RU, ...), and those of three
or more letters, even if they belong to the category of rare words (foreign localities, rivers,
abbreviations, etc.) that are not found in the Romanian Language Dictionary (see
[3], pp. 82-307, “Rebus glossary”).








The grids have both across and down words. 

We divide the grid into 3 zones: 

a) the four corners of the grid (zone A)

b) the grid border, without the four corners (zone B)

c) the middle of the grid (zone C)

We assume that the grid has n lines, m columns, and p black points. 

Then: 

Proposition 1. The overall number of words (across and down) of the grid is equal
to n + m + pNB + 2 · pNC, where

pNB = number of black points in zone B,

pNC = number of black points in zone C.

Proof: We consider initially the grid without any black points. Then it has
n + m words.

- If we put a black point in zone A, the number of words stays the same. (So it does not
matter how many black points are found in zone A.)

- If we put a black point in zone B, e.g. on line 1 and column j, 1 < j < m, the
number of words increases by one (because on line 1 two words are formed where before
there was only one, while column j still holds one word). The case is analogous if we put a
black point on column 1 and line i, 1 < i < n (the grid may be transposed: the horizontal
line becomes the vertical line and vice versa). Thus, each black point in zone B adds one word
to the overall number of words of the grid.

- If we put a black point in zone C, say on line i, 1 < i < n, and column j,
1 < j < m, then the number of words increases by two: both on line i and on column j two
words appear now, unlike the previous case, where a second word appeared only on
the line. Thus, each black point in zone C adds two words to the overall number of words
of the grid. From this proof results:
Corollary 1. The minimum number of words of an n x m grid is n + m.
This minimum is achieved when there are no black points in zones B and C.

Corollary 2. The maximum number of words of an n x m grid having p black points
is n + m + 2p, and it is achieved when all p black points are found in zone C.

Corollary 3. An n x m grid having p black points will have a minimum number
of words when we place the black points first in zone A, then in zone B (alternately,
because two or more juxtaposed black points are not allowed), and the rest in
zone C.

Proposition 2. The difference between the number of words on the horizontal and
on the vertical of an n x m grid is n - m + pNBO - pNBV, where

pNBO = number of black points in zone BO,

pNBV = number of black points in zone BV.

We divide zone B into two parts:

- zone BO = the horizontal part of zone B (lines 1 and n)

- zone BV = the vertical part of zone B (columns 1 and m).

The proof of this proposition follows the previous one and uses its results.

If we do not have any black points in the grid, the difference between the number of words
on the horizontal and on the vertical is n - m.

- If we have a black point in zone A, the difference does not change. The same holds
for zone C, where one word is added in each direction.

- If we have a black point in zone BO, one horizontal word is added and the difference becomes n - m + 1;
a black point in zone BV adds one vertical word and the difference becomes n - m - 1. From
Propositions 1 and 2 results:

Proposition 3. An n x m grid has n + pNBO + pNC words on the horizontal and
m + pNBV + pNC words on the vertical.

A first method of proof repeats the reasoning used in the proofs of Propositions 1 and 2.

A second method calculates the numbers of across and
down words directly from Propositions 1 and 2, since their sum (Proposition 1) and their difference (Proposition 2) are
known.
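Propositions 1-3 can be checked on a small example. The following is a minimal Python sketch and not part of the original study: the grid encoding ('#' for a black point, '.' for a letter), the function names, and the 5 x 5 example grid are assumptions made for illustration. The grid is assumed to contain no two juxtaposed black points, as the rebus rules require.

```python
def count_words(grid):
    """Count across and down words by scanning runs of letters.

    A word is a maximal run of non-black cells in a line or column;
    one-letter runs count as words, as in the text.
    """
    def runs(cells):
        return sum(1 for segment in "".join(cells).split("#") if segment)

    across = sum(runs(row) for row in grid)
    down = sum(runs(column) for column in zip(*grid))
    return across, down


def predicted_words(grid):
    """Across and down word counts according to Proposition 3."""
    n, m = len(grid), len(grid[0])
    p_nbo = p_nbv = p_nc = 0
    for i, row in enumerate(grid):
        for j, cell in enumerate(row):
            if cell != "#":
                continue
            on_horizontal_border = i in (0, n - 1)
            on_vertical_border = j in (0, m - 1)
            if on_horizontal_border and on_vertical_border:
                continue  # zone A: no effect on the counts
            if on_horizontal_border:
                p_nbo += 1  # zone BO
            elif on_vertical_border:
                p_nbv += 1  # zone BV
            else:
                p_nc += 1  # zone C
    return n + p_nbo + p_nc, m + p_nbv + p_nc


# Hypothetical 5 x 5 grid: one black point in each of BO, C, BV and A.
example = [
    ".#...",
    ".....",
    "..#..",
    "#....",
    "....#",
]
print(count_words(example), predicted_words(example))
```

The direct scan and the closed formulas should agree on any grid that respects the no-juxtaposition rule; for grids with adjacent black points the formulas overcount, which is why the assumption matters.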

Proposition 4. The mean length of the words (= number of letters) of an n x m grid with p
black points is

$$\geq \frac{2(nm - p)}{n + m + 2p}.$$

Indeed, the maximum number of words is n + m + 2p, the number of letters is
nm - p, and each letter is included in two words: one across and one down. The fewer
its one- and two-letter words and its black points, the more densely crossed a grid is
(assuming that it meets the other known restrictions).

In Romanian grids the black points may be at most 15% of the total number of boxes,
the value being rounded to the nearest integer (e.g. 15% of a
13x13 grid equals 25.35 ≈ 25; of a 12x12 grid it is 21.6 ≈ 22). So, in the previous
properties, for n x m grids with p black points we may replace p by

$$\left[\frac{3}{20}\, nm\right], \quad \text{where } [x] = \max\{a \in \mathbb{N} : a - x < 0.5\}.$$
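As a quick numeric illustration of Proposition 4 and of the 15% rounding rule, here is a small Python sketch. The function names are illustrative assumptions, not taken from the paper. Exact integer and rational arithmetic is used so that the rounding [x] = max{a ∈ N : a - x < 0.5} is reproduced without floating-point surprises.

```python
from fractions import Fraction


def max_black_points(n, m):
    """[3nm/20]: the largest integer a with a - 3nm/20 < 0.5.

    a < (3nm + 10) / 20, so a = ceil((3nm + 10) / 20) - 1 = (3nm + 9) // 20.
    """
    return (3 * n * m + 9) // 20


def mean_word_length_bound(n, m, p):
    """Lower bound 2(nm - p) / (n + m + 2p) from Proposition 4."""
    return Fraction(2 * (n * m - p), n + m + 2 * p)


for size in (12, 13):
    p = max_black_points(size, size)
    bound = mean_word_length_bound(size, size, p)
    print(f"{size}x{size}: p = {p}, mean word length >= {float(bound):.3f}")
```

The two grid sizes are the ones used as examples in the text (25 and 22 black points respectively).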



§2. STATISTIC RESEARCHES ON GRIDS 
In [1] we find the notion of “ecart of a sound x”, denoted by $\alpha(x)$, which equals the
difference between the rank of x in Romanian and the rank of x in the analyzed text.

We extend this notion to that of the ecart of a text t, denoted by $\alpha(t)$ and defined by

$$\alpha(t) = \frac{1}{n} \sum_{i=1}^{n} |\alpha(A_i)|,$$

where $\alpha(A_i)$ is the ecart of the sound $A_i$ (in the sense of [1]) and n is the number of distinct sounds in the text t.
(If there are letters of the alphabet which are not found in the analyzed text, they are
entered in the frequency table with the last ranks.)

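Proposition 1. We have the double inequality

$$0 \le \alpha(t) \le \frac{n-1}{2} + \frac{1}{n}\left[\frac{n}{2}\right],$$

where [y] represents the whole part of the real number y.

Indeed, the first inequality is evident. For the second one, let

$$\Phi = \begin{pmatrix} 1 & 2 & \dots & n \\ j_1 & j_2 & \dots & j_n \end{pmatrix}. \quad \text{Then} \quad \sum_{i=1}^{n} |\alpha(A_i)| = \sum_{i=1}^{n} |i - j_i|.$$

This permutation constitutes a mathematical model of the two frequency tables of
sounds: in Romanian (the first line) and in the text t (the second line).

For the permutation

$$\psi = \begin{pmatrix} 1 & 2 & \dots & n-1 & n \\ n & n-1 & \dots & 2 & 1 \end{pmatrix}$$

we have

$$\sum_{i=1}^{n} |i - j_i| = 2[(n-1) + (n-3) + (n-5) + \dots] = 2\sum_{k=1}^{[n/2]} (n - 2k + 1) = 2\left[\frac{n}{2}\right]\left(n - \left[\frac{n}{2}\right]\right) = \frac{n(n-1)}{2} + \left[\frac{n}{2}\right],$$

whence

$$\alpha(t) = \frac{n-1}{2} + \frac{1}{n}\left[\frac{n}{2}\right].$$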



By induction on $n \ge 2$, we now prove that the sum $S = \sum_{i=1}^{n} |i - j_i|$ attains its maximum
value for the permutation $\psi$.

For n = 2 and n = 3 this is easily checked directly. Let us suppose the assertion true for
all values < n + 2, and let us prove it for n + 2:

$$\psi = \begin{pmatrix} 1 & 2 & \dots & n+1 & n+2 \\ n+2 & n+1 & \dots & 2 & 1 \end{pmatrix}.$$

Removing the first and the last column, we obtain

$$\psi' = \begin{pmatrix} 2 & \dots & n+1 \\ n+1 & \dots & 2 \end{pmatrix},$$

which is a permutation of n elements and for which S has the same value as for the
permutation

$$\psi'' = \begin{pmatrix} 1 & \dots & n \\ n & \dots & 1 \end{pmatrix},$$

i.e. the maximum value ($\psi''$ was obtained from $\psi'$ by diminishing each element by one).

The permutation of 2 elements

$$\eta = \begin{pmatrix} 1 & n+2 \\ n+2 & 1 \end{pmatrix}$$

gives the maximum value for S.

But $\psi$ is obtained from $\psi'$ and $\eta$:

$$\psi(i) = \begin{cases} \psi'(i), & \text{if } i \notin \{1, n+2\}, \\ \eta(i), & \text{otherwise.} \end{cases}$$



Remark: The bigger the ecart of a text, the bigger its “angle of deviation” from the
usual language.

It would be interesting to calculate, for example, the ecart of a poem. 
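The text ecart and its upper bound can be sketched in a few lines of Python. This is only an illustration: the function names are assumptions, and rankings are represented as strings of symbols listed from most to least frequent. The loop checks by brute force, for small n, that the reversed permutation attains the largest sum of rank displacements, n(n-1)/2 + [n/2].

```python
from itertools import permutations


def text_ecart(language_order, text_order):
    """Mean absolute rank difference between two frequency rankings.

    Both arguments list the same symbols, most frequent first.
    """
    rank_in_text = {symbol: rank for rank, symbol in enumerate(text_order, start=1)}
    total = sum(
        abs(rank - rank_in_text[symbol])
        for rank, symbol in enumerate(language_order, start=1)
    )
    return total / len(language_order)


def max_total_displacement(n):
    """n(n-1)/2 + [n/2], the sum of |i - j_i| for the reversed permutation."""
    return n * (n - 1) // 2 + n // 2


# Brute-force check for small n that no permutation exceeds the reversed one.
for n in range(2, 8):
    best = max(
        sum(abs(i - j) for i, j in enumerate(perm, start=1))
        for perm in permutations(range(1, n + 1))
    )
    assert best == max_total_displacement(n)

print(text_ecart("ABCD", "DCBA"))  # reversed ranking of 4 symbols
```

For a poem, one would build `text_order` by sorting its letters (or sounds) by decreasing count and pass the language ranking as `language_order`.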

The notion of ecart could then be extended even further:

a) the ecart of a word, equal to the difference between the rank of the word in the
language and its rank in the text;

b) the ecart of a text (with respect to words):

$$\alpha_c(t) = \frac{1}{n} \sum_{i=1}^{n} |\alpha_c(a_i)|,$$

where $\alpha_c(a_i)$ is the ecart of the word $a_i$, and n is the number of distinct words in the text t.



* 



We give below some rebus statistic data. By examining 150 grids [4] we obtain 
the following results: 
Occurrence frequency of words in the grid, depending on their length (in letters)



Letter order   Letter   Letter occurrence mean percentage
 1             A        15.741%
 2             I        12.849%
 3             T         9.731%
 4             R         9.411%
 5             E         8.981%
 6             O         5.537%
 7             N         5.053%
 8             U         4.354%
 9             S         4.352%
10             C         4.249%
11             L         4.248%
12             M         4.010%
13             P         3.689%
14             D         1.723%
15             B         1.344%
16             G         1.290%
17             F         0.860%
18             V         0.806%
19             Z         0.752%
20             H         0.537%
21             X         0.430%
22             J         0.053%
23             K         0.000%

Vowels mean percentage: 47.462%. Consonants mean percentage: 52.538%.







It is easy to see that 49.035% of the words consist of only 1, 2, or
3 letters; of course, many of them are incomplete words.

* 



The study of 50 grids resulted in: 

Occurrence frequency of words in a grid (see next page). 

It is noticed that vowels percentage in the grid (47.462%) exceeds the vowels percentage 
in language (42.7%). 

So, we can state the following generalization:

Statistical proposition (1): In a grid, the number of vowels tends to be approximately
47.5% of the total number of letters.

Here is some supporting evidence: a word with n syllables has at least n vowels (in
Romanian there is no syllable without a vowel; see [2]).

The percentage of vowels in Romanian is 42.7%; because a grid must form
words both across and down, the number of vowels will increase. Also, the last two lines and
columns are endings of other words in the grid; thus they will usually have more vowels.
When the number of black points decreases, the number of vowels increases (in order to have an
easier crossing, one needs either more black points or more vowels). (A vowel has a higher
probability of entering into a word than a consonant.)

Especially in “record grids” (see [3], pp. 33-48) the vowels and consonants 
alternation is noticed. Another criterion for estimating the grid value is the bigger 
deviation from this “statistical law” (the exception confirms the rule!): i.e. the smaller the 
vowel percentage in a grid, the bigger its value. 

Statistical proposition (2): Generally, the number of horizontal words is approximately equal to the
number of vertical ones.

Here is the supporting evidence: 100 classical grids from [4] were analyzed experimentally,
yielding a percentage of 49.932% horizontal words. Usually, the classical grids
are square (n = m), so the difference between the numbers of horizontal and vertical words is (see
Proposition 2):

n - m + pNBO - pNBV = pNBO - pNBV .

The difference between the numbers of black points in zone BO and in zone BV cannot
be too big (±1, ±2, and rarely ±3). (Usually, there are not many black points in
zone B, because they are not economical in crossing; see the proof of Proposition 1.)

Taking from [1] the following ranking of letters by frequency in the language:

 1. E    2. I    3. A    4. R    5. N    6. T
 7. U    8. C    9. L   10. S   11. O   12. D
13. P   14. M   15. B   16. V   17. G   18. F
19. Z   20. H   21. J   22. X   23. K

(because in the grid Ă, Â, Î, Ș, Ț are replaced by A, A, I, S, T, respectively, they were cancelled from the above
ranking), the ecart of the 150 grids becomes

$$\alpha(g) = \frac{1}{23} \sum_{i=1}^{23} |\alpha(A_i)| \approx 1.391;$$

the entropy is:

$$H = -\frac{1}{\log_{10} 2} \sum_{i=1}^{23} p_i \log_{10} p_i \approx 3.865,$$

and the informational energy (after O. Onicescu) is:

$$E(g) = \sum_{i=1}^{23} p_i^2 \approx 0.084.$$
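The three values above can be recomputed from the two frequency tables. The following Python sketch is an illustration only: the variable and function names are assumptions, the letter percentages are copied from the grid table, and the language ranking is the one quoted from [1] with diacritics removed.

```python
import math

# Letter ranking in the language, most frequent first (from [1], no diacritics).
LANGUAGE_ORDER = "EIARNTUCLSODPMBVGFZHJXK"

# Mean letter percentages in the grids, most frequent first.
GRID_PERCENT = {
    "A": 15.741, "I": 12.849, "T": 9.731, "R": 9.411, "E": 8.981,
    "O": 5.537, "N": 5.053, "U": 4.354, "S": 4.352, "C": 4.249,
    "L": 4.248, "M": 4.010, "P": 3.689, "D": 1.723, "B": 1.344,
    "G": 1.290, "F": 0.860, "V": 0.806, "Z": 0.752, "H": 0.537,
    "X": 0.430, "J": 0.053, "K": 0.000,
}


def ecart(language_order, text_order):
    """Mean absolute difference between language rank and text rank."""
    rank_in_text = {letter: rank for rank, letter in enumerate(text_order, start=1)}
    total = sum(
        abs(rank - rank_in_text[letter])
        for rank, letter in enumerate(language_order, start=1)
    )
    return total / len(language_order)


def shannon_entropy(probabilities):
    """Entropy in bits; log2(p) equals log10(p) / log10(2)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def informational_energy(probabilities):
    """Onicescu informational energy: the sum of squared probabilities."""
    return sum(p * p for p in probabilities)


grid_order = list(GRID_PERCENT)  # dicts keep insertion order (Python 3.7+)
probabilities = [value / 100 for value in GRID_PERCENT.values()]

print(round(ecart(LANGUAGE_ORDER, grid_order), 3))
print(round(shannon_entropy(probabilities), 3))
print(round(informational_energy(probabilities), 3))
```

Letters with zero frequency (K here) are skipped in the entropy sum, following the usual convention that 0 · log 0 = 0.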

Examining 50 grids we obtain: 
Words frequency in a grid with respect to the number of syllables

Number of syllables   Mean percentage of occurrence of a word in a grid
1                     35.588%
2                     26.920%
3                     21.765%
4                      9.551%
5                      5.294%
6                      0.882%
7                      0.000%
8                      0.000%

Mean length of a word in syllables: 2.246



(In the category of one-syllable words we also counted the words of one, two, or three letters without
any meaning, i.e. the rare words.) One can see that the percentage of words
consisting of one or two syllables is 62.508% (quite high).
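The summary figures of the syllable table follow from its eight percentages. A minimal Python sketch (variable names are assumptions; the percentages are copied from the table) computes the weighted mean length and the share of one- and two-syllable words:

```python
# Mean percentage of words with 1..8 syllables, copied from the table.
SYLLABLE_PERCENT = [35.588, 26.920, 21.765, 9.551, 5.294, 0.882, 0.000, 0.000]

# Weighted mean: sum of (number of syllables) x (share of words).
mean_syllables = sum(
    count * percent for count, percent in enumerate(SYLLABLE_PERCENT, start=1)
) / 100

# Share of words with one or two syllables.
short_word_share = SYLLABLE_PERCENT[0] + SYLLABLE_PERCENT[1]

print(f"{mean_syllables:.3f} syllables per word")
print(f"{short_word_share:.3f}% of words have 1 or 2 syllables")
```

The weighted mean comes out within rounding distance of the 2.246 reported in the table.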

Another statistic (on 50 grids), concerning the predominant parts of speech in a
grid, established the following first three places:

1. nouns 45.441% 

2. verbs 6.029% 

3. adjectives 2.352% 

Notice the large number of nouns. 



SECTION II. REBUS CLUES 



§1. STATISTICAL RESEARCHES ON REBUS CLUES 

Studying the clues of 100 “clues grids”, the following statistical data resulted: 
Rebus clues frequency according to their length (words number ) 

(see the next page) 

It is noticed that the predominant clues are formed of 2, 3, or 4 words. For results 
obtained by investigating 100 “clues grids”, see the next page. 

It is worth mentioning that the percentage of vowels in rebus clues (46.679%)
exceeds the percentage of vowels in the language (42.7%).

Calculating the ecart of the clues (in accordance with the previous formula), we obtain:

$$\alpha(d_r) = \frac{1}{27} \sum_{i=1}^{27} |\alpha(A_i)| \approx 1.185$$

(the sound frequency used by Solomon Marcus in [1] was used here); the entropy (Shannon)
is:

$$H = -\frac{1}{\log_{10} 2} \sum_{i=1}^{27} p_i \log_{10} p_i \approx 4.226,$$

and the informational energy (O. Onicescu) is:

$$E(d_r) = \sum_{i=1}^{27} p_i^2 \approx 0.062.$$

(The calculations were done by means of a pocket calculator.)

Letters occurrence frequency in the rebus clues

Letter order   Letter   Mean percentage of letter occurrence in clues
 1             E        10.996%
 2             I         9.778%
 3             A         9.266%
 4             R         7.818%
 5             U         6.267%
 6             N         6.067%
 7             T         5.611%
 8             C         5.374%
 9             L         4.920%
10             O         4.579%
11             P         4.027%
12             Ă         3.992%
13             S         3.831%
14             M         3.309%
15             D         3.079%
16             Â         1.801%
17             V         1.527%
18             F         1.449%
19             Ș         1.360%
20             Ț         1.338%
21             G         1.330%
22             B         1.238%
23             H         0.532%
24             J         0.358%
25             Z         0.092%
26             X         0.037%
27             K         0.024%

Vowels percentage: 46.679%. Consonants mean percentage: 53.321%.
Letters no. (mean) necessary to clue a grid: 657.342.
Mean length of a word (in letters) used in clues: 4.374.

SECTION III. HYPOTHESIS ON THE DETERMINATION OF A RULE FOR THE
CROSS WORDS PUZZLES



Crossword problems are composed, as we know, of grids and definitions.
In Romanian, the condition is imposed that the percentage of black boxes
relative to the total number of boxes of the grid must not exceed 15%.

Why 15%, and not more or less? This is the question that this article tries to
answer. (The question is due to Professor Solomon MARCUS - National Symposium of
Mathematics "Traian Lalesco", Craiova University, June 10, 1982.)

First of all, we present a table which summarizes the statistics
of the grids containing a very small percentage of black boxes (from [2], pp. 27-29):

THE GRIDS-RECORDS

Grid        Minimum number of        Percentage of   Number of grids-records
dimension   registered black boxes   black boxes     constructed until June 1, 1982
8x8         0                        0.000%          24
9x9         0                        0.000%          3
10x10       3                        3.000%          2
11x11       4                        3.305%          1
12x12       8                        5.555%          1
13x13       12                       7.100%          1
14x14       14                       7.142%          1
15x15       17                       7.555%          1
16x16       20                       7.812%          2

In this table, one can see that the larger the dimension of the grid, the larger the
percentage of black boxes, because the number of long words is reduced.

The current dimensions of grids go from 10x10 to 15x15.

One can notice that the number of grids having a percentage of black boxes
smaller than 8% is very small: the totals in the last column represent all such grids created
in Romania from 1925 (when the first crossword problems appeared in Romania)
until today. It is thus seen that the number of grids-records is negligible
compared with the thousands of grids created. For this reason, the rule imposing the
percentage of black boxes should have set it above 8%. But
crosswords, being puzzles, must address a large audience, so these problems must not be
made too difficult.

Hence a percentage of black boxes at least equal to 10%.

Nor must they be too easy, that is, require no effort from those
who compose them; hence a percentage of black boxes smaller than 20%. (Otherwise
it becomes possible to compose grids formed wholly of words of 2 or
3 letters.)
To support the second assertion, one assumes that the average length of the words
of an n x m grid with p black boxes is approximately equal to

$$\frac{2(nm - p)}{n + m + 2p}$$

(from [3], § 1, Prop. 4). For us, p is 20% of n · m, therefore it results that

$$\frac{2\left(nm - \frac{20}{100}\, nm\right)}{n + m + 2 \cdot \frac{20}{100}\, nm} < 3 \iff \frac{1}{n} + \frac{1}{m} > \frac{2}{15}.$$

Thus, for current grids having 20% of black boxes, the average length of the
words would be smaller than 3.
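The inequality can be checked numerically for the current grid sizes. The sketch below is an illustration under stated assumptions: the function name is hypothetical, grids are taken to be square, and the number of black boxes is left as the exact fraction 20% · nm instead of being rounded to an integer.

```python
from fractions import Fraction


def average_word_length(n, m, black_share):
    """2(nm - p) / (n + m + 2p) with p = black_share * n * m (not rounded)."""
    p = black_share * n * m
    return 2 * (n * m - p) / (n + m + 2 * p)


twenty_percent = Fraction(1, 5)
for size in range(10, 16):
    length = average_word_length(size, size, twenty_percent)
    condition = Fraction(1, size) + Fraction(1, size) > Fraction(2, 15)
    print(size, round(float(length), 3), length < 3, condition)
```

For sizes 10 to 14 both the bound on the length and the condition 1/n + 1/m > 2/15 hold; at 15 x 15 the expression equals exactly 3, which is the boundary case of the inequality.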

Similarly, at the beginnings of crossword puzzles the percentage of black
boxes was not small: thus, in an 11x11 grid from 1925 one counts 33 black boxes,
that is, a percentage of 27.272% (from [2], p. 27).

As they developed, "stronger" conditions were imposed on these puzzles,
that is, a reduction in the number of black boxes.

For the selection of a percentage between 10% and 20%, it is supposed that people’s
predilection for round numbers was essential (crosswords are puzzles and do not need
the mathematical precision of the sciences). Hence the rule of 15%.

A statistic (from [3], § 2) shows that the percentage of black boxes in current
grids is approximately 13.591%. The rule is thus relatively easy to follow, and it can only
attract new crossword enthusiasts.

To answer the proposed question completely, one would also need to consider
some philosophical, psychological, and especially sociological aspects, above all those
connected with the history of this puzzle, its subsequent development, and its traditions.



REFERENCES 

[1] Marcus, Solomon; Nicolau, Edmond; Stati, S. - “Introducere in lingvistica
matematica”, Bucharest, 1966 (translated into Italian, Patron, Bologna,
1971; into Spanish, Teide, Barcelona, 1978).

[2] Andrei, Dr. N. - “Indreptar rebusist”, Editura Sport-Turism, Bucharest,
1981.

[3] Smarandache, Florentin - “A mathematical linguistic approach to Rebus”,
published in “Revue roumaine de linguistique”, Tome XXVIII, 1983,
collection “Cahiers de linguistique theorique et appliquee”, Tome XX,
1983, no. 1, pp. 67-76, Bucharest.

[Published in “Caruselul enigmistic”, Bacau, Nr. 5, 1986, 2-6 May, pp. 29 and 31] 







SECTION IV. THE LANGUAGE OF SPIRITUAL REBUS DEFINITIONS 



“The rebus’ language” lies somewhere at the border between the scientific language and
the poetic one, perhaps having much in common with the usual language too, and even with the
musical one (puzzles have a certain acoustic resonance).

The theme grids, having direct definitions (close to those from a
dictionary; see [3], pp. 50-56), use a language close to the scientific one (and even to the usual one,
through its simple mode of expression), whereas the language of “the grid’s definitions” is close
to the poetic one. There are even literary definitions (see [3], p. 57, and [4]), which use
literary stylistic procedures such as the metaphor, the comparison, the allegory, etc.
Below we present a parallel between the SCIENTIFIC LANGUAGE, the POETIC
LANGUAGE, and the REBUS’ LANGUAGE (“THE GRIDS’ DEFINITIONS”), closely
following the oppositions from [1] (chap. “Oppositions between the scientific language and the
poetic one”), which we then apply to the rebus’ language.



Each entry below gives, in order, the feature of the SCIENTIFIC LANGUAGE, of the POETIC LANGUAGE, and of the REBUS’ LANGUAGE.

Scientific: rational hypothesis.
Poetic: emotional hypothesis.
Rebus: rational + emotional hypothesis (reading the definition, you think for an instant and sometimes go down a wrong road; when you hit upon the answer, i.e. the corresponding word from the grid, you are enlightened and enthusiastic).

Scientific: logical density.
Poetic: density of suggestion.
Rebus: logical density + density of suggestion (the definition must use very few words to explain a lot: logical density; it must be novel, enlightening, emotional: density of suggestion).

Scientific: infinite synonymy.
Poetic: absent synonymy.
Rebus: reduced synonymy (not truly infinite, but not absent either; two identical words from the grid cannot have more than one rebus definition; but a definition will be almost uniquely expressed, therefore the synonymy is quasi absent).

Scientific: absent homonymy.
Poetic: infinite homonymy.
Rebus: large homonymy (neither absent, nor infinite; in the case of the definition, the meaning is up to the author: even if the reader understands something else, the rational part will intervene, since the word must fit its proper place in the grid; even the literary definitions in the grids no longer have an infinite homonymy, because here too the rational part intervenes: finding an answer by all means; in the case of the theme grids with direct definitions, the homonymy is almost absent).

Scientific: artificial.
Poetic: natural.
Rebus: natural and artificial (in general the definitions have a natural character; but the definitions based on letter puzzles, e.g. the definition “Night’s beginning” with the answer “NI”, have an artificial character).

Scientific: general.
Poetic: singular.
Rebus: singular and general (only the definitions based on letter puzzles may have a general character).

Scientific: translatable.
Poetic: untranslatable.
Rebus: translatable (in the sense that the definition has a logical meaning).

Scientific: the presence of style problems.
Poetic: the absence of style problems.
Rebus: the absence of style problems (the same definition cannot be used without changing the nuance, while a word in the grid can be defined in multiple ways).

Scientific: finitude in space, constant in time.
Poetic: variability in space and time.
Rebus: variability in space and time, but smaller than that of the poetic language.

Scientific: numerable.
Poetic: innumerable.
Rebus: innumerable.

Scientific: transparent.
Poetic: opaque.
Rebus: semi-opaque (or semitransparent: at the beginning the definition seems opaque, until one finds the answer).

Scientific: transitive.
Poetic: reflexive.
Rebus: reflexive (except, again, the definitions based on letter games, which also have a transitive character).

Scientific: independency on expression.
Poetic: dependency on expression.
Rebus: dependency on expression.

Scientific: independency on musical structure.
Poetic: dependency on musical structure.
Rebus: dependency on musical structure.

Scientific: paradigmatic.
Poetic: syntagmatic.
Rebus: syntagmatic.

Scientific: concordance between the paradigmatic and syntagmatic distance.
Poetic: non-concordance between the paradigmatic and syntagmatic distance.
Rebus: non-concordance between the paradigmatic and syntagmatic distance (pairs of different words, word games: methods used as in poetry).

Scientific: short contexts.
Poetic: long contexts.
Rebus: short contexts (!) (here it is closer to the scientific language, because the Latin proverb “Non multa sed multum” is taken into account; from the previous statistical investigations it resulted that the mean length of a (spiritual) rebus definition is 4.192 words; the definitions with letter puzzles usually have very few words).

Scientific: contextual dependency.
Poetic: it tends towards expression independency.
Rebus: contextual dependency (in the case of the theme grids there is also a small dependency; there also exist rare cases when a definition depends on a previous definition, usually the definitions with letter or word games).

Scientific: logic.
Poetic: illogic.
Rebus: logic.

Scientific: denotation.
Poetic: connotation.
Rebus: connotation (if a definition revealed the direct meaning of a word, we would have direct definitions, like in a dictionary, and then we would totally lose “the surprise”, “the spirituality”, “the ingenuity”, “the spontaneity”; in thematic grids, the definitions have a denotative character).

Scientific: routine.
Poetic: creation.
Rebus: creation and ... experience (not to call it routine!).

Scientific: general stereotypes.
Poetic: personal stereotypes.
Rebus: personal stereotypes (there even exist the so-called grids of “personal manner”; see [3], pp. 56-58).

Scientific: explicable.
Poetic: ineffable.
Rebus: ineffable ... which is explained! (Taken separately, the definition, not seen as a question, is ineffable; taken together with the answer, it becomes explicable. In general, the definition also presents a degree of ambiguity (several tracks for guidance), otherwise it would be banal, and a degree of indetermination: the proper sense is often used instead of the figurative one, or the reverse. It also has its own logic, which becomes tangible once one finds the answer.)

Scientific: lucidity.
Poetic: magic.
Rebus: magic - lucidity (in accordance with the immediately preceding entry; at the beginning the rebus language dominates the solver, until he finds the “key”, when he in turn becomes the dominant one, as in the poetic language).

Scientific: predictable.
Poetic: unpredictable.
Rebus: unpredictable at the beginning, becoming predictable after solving (unpredictable converted into predictable).

CONSIDERATIONS REGARDING THE SCIENTIFIC LANGUAGE AND
“LITERARY LANGUAGE”

As nothing in nature is absolute, evidently there is no precise border
between the scientific language and “the literary” one (the language used in literature):
there are zones where these two languages intersect.

In [1], chapter “Oppositions between the scientific and poetic languages”, Solomon
Marcus presents the differences between the two, differences which also bring them closer.

We will skate a little on the edge of this material, presenting the common parts of the
scientific language and the literary language:

- both are geared to finding the original, the novel;

- both suppose a creative process (finding the solution of a problem means
creation; so does writing a phrase);

- both literature and science have an art of being taught, studied, and learned (the
methodology of teaching arithmetic, or the Romanian language, etc.);

- in science too there is an esthetic (for example, “the mathematical esthetic”), just as
in literature there exists a logic (even the absurd of Eugene Ionesco and the myths of
Mircea Eliade have their own specific logic; analogously, we can extend the idea to
Tristan Tzara’s Dadaism, which has a specific logic of construction: one cuts words
from newspapers, mixes them, and then forms verses);

- scientific development implies a literary development in a special sense: thus
science-fiction literature appeared, literary writings which use information
obtained by science; contemporary literature also treats scientific problems (for
example, Augustin Buzura wrote the novel “The absents”, describing the life of a medical
researcher; the engineer poet George Stanca introduces technical terms in his poems, and one
verse from his volume “Maximum tenderness” reads: “sin² x + cos² x = 1”!;
likewise the engineer poet Gabriel Chifu (the volume “An interpretation of the
Purgatory”) and the mathematics professor Ovidiu Florentin, author of a volume even
entitled “Formulas for the spirit”, each poem being considered a momentary
“formula” (depending on time, place, space, individual) for the spirit);

- even the writing of some contemporary novels inspired by workers’ and
peasants’ life requires scientific documentation on the writers’ part.

Literature has an esthetic influence on science; there exist mathematical
metaphors (see [1], [2]) and, in general, we can speak of “scientific metaphors”; one cannot
know what ideas and relations will be discovered in science. The degree of understanding
(exegesis) of a poem, and of a literary text in general, also depends on each individual’s
degree of culture, on his initiation (his seniority in that domain), and on his scientific
knowledge.

- there are many scientists who, besides their scientific works, also write literary
works or works in related domains (for example, the memoirs of the academician
(mathematician) Octav Onicescu, “On the life’s roads”; the renowned Romanian physician
Gheorghe Marinescu wrote poems (using Dacian words) under the pen name George
Dinizvor; the great Ion Barbu - Dan Barbilian excelled both as a poet and as a mathematician;
the great poet Vasile Voiculescu was a good physician; the mathematics professor
Aurel M. Buricea writes poetry, as does the mathematician Ovidiu Florentin -
Florentin Smarandache, who writes poems and mathematics articles; in world literature we
find the poet-mathematicians Omar Khayyam and Lewis Carroll - Charles L. Dodgson),
but writers who would do fundamental scientific or technical research hardly exist!

SECTION V. THE LETTERS’ FREQUENCY (BY EQUAL GROUPS) IN THE
ROMANIAN JURIDICAL TEXTS

Analyzing the degree of wear of the keys of a typewriter which
was used for more than 40 years at the clerk's office of a court in a Romanian
district (Valcea), one partitions the letters into the following groups:

1) Letters completely worn off (one cannot read anything on the
key anymore).

2) Letters of which one sees only one point, hardly perceptible.



10) Letters from which only one point is missing.

11) Letters which are seen perfectly, without anything missing.

12) Letters which have almost not been touched, being covered with dust.

The following results were obtained:



 1) E, A
 2) I
 3) R
 4) T
 5) S
 6) P
 7) O, C, U, D, Z
 8) N
 9) L
10) V, M
11) F, G, B, H, X, J, K
12) W, Q, Y



This classification differs slightly from that of [1], because the letters A, Ă, Â are 
here counted as one letter, A; likewise I and Î are counted as I, S and Ș as S, and T and Ț as T. 

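By computing the deviation of this text from [2], we obtain: 

$$\alpha(j) = \frac{1}{n}\sum_{i=1}^{n} |\alpha(A_i)| \approx 2.348,$$ 

where $n$ is the number of letters compared. Thus the deviation of the juridical language 
from the current frequencies is much larger than that of the crossword language: 
$\alpha(g) \approx 1.391$ and $\alpha(d_r) \approx 1.185$. 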

The letters P, Z and N made the most spectacular jumps: 

$\alpha(P) = 6$, $\alpha(Z) = 7$, $\alpha(N) = 8$. 



Perhaps this article surprises by its banality. But whereas other authors spent 
months on calculations, choosing certain books and counting the letters (!) by 
computer, I have deduced this frequency of the letters in a few minutes (!), by 
a simple observation. 

SECTION VI. LINGUISTIC-MATHEMATICAL STATISTICS IN RECENT ROMANIAN POETRY 



“Mathematics is logical enough to be able to detect the internal logics of 
poetry and crazy enough not to lag behind the poetic ineffable” (Solomon Marcus). 

The author of this article aims at a statistical investigation of a recently published 
volume of poetry [3], which will make possible some more general conclusions on the 
evolution of poetry in the 20th century (whatever the literary current: hermetism, surrealism 
or any other). Certain modifications in the structure of poetry that occurred in its evolution 
from classicism to modernism are also presented. Men of letters have never agreed with 
mathematics and, especially, with its interference in art. Let us quote one of them: 
“Note that, in my opinion, all literature is grotesque... (...) The writer's only excuse 
is to realize that he is playing, that literature is a game” (Eugene 
Ionesco). Well, if literature is a game, why could it not be subjected to mathematical 
investigation? 

The book chosen for this study (see [3]) contains 44 poems (of which the first 
and the last are a sort of poem-essays on Romanian poetry). It comprises over 250 
sentences, over 700 verses, over 2,500 words and over 11,700 letters (not sounds). 

MORPHOLOGICAL ASPECTS 

1. The frequency of words depending on the grammatical category they belong to. 



Category                         Percentage 

1. Nouns                         35.592% 
2. Verbs (predicative moods)     13.079% 
3. Adjectives                     6.183% 
4. Adverbs                        4.829% 

“Full” words                     59.729% 
“Empty” words                    40.271% 





1. The “full” words category includes - according to the author - nouns, verbs (predicative moods only), 
adjectives and adverbs. The “empty” words category includes verbs in non-predicative moods (i.e. infinitives, 
gerunds, past participles, supines), numerals, articles, pronouns, conjunctions, prepositions and interjections. 
The same terminology was also used by Solomon Marcus in his “Poetica matematica”, published by Ed. Academiei, 
Bucharest, 1970 (it was translated into German and published by Athenäum, Frankfurt am Main, 1973). 

2. The average distribution of “full” words per verse (line), sentence and poem 

a)  1.255   nouns/line 
b)  0.461   verbs (p.m.)/line 
c)  0.218   adjectives/line 
d)  0.172   adverbs/line 
e)  3.464   nouns/sentence 
f)  1.273   verbs (p.m.)/sentence 
g)  0.602   adjectives/sentence 
h)  0.475   adverbs/sentence 
i) 20.393   nouns/poem 
j)  7.492   verbs (p.m.)/poem 
k)  3.543   adjectives/poem 
l)  2.792   adverbs/poem 



We may conclude: 

CONJECTURE 1. In the recent Romanian poetry the percentage of adjectives is, 
on average, under that of the total of words. 

CONJECTURE 2. The percentage of verbs (predicative moods) is, on average, 
under 15% of the total words. 

In support of conjectures 1 and 2 we also mention: 

- only one in six nouns is modified by an adjective, i.e. the role of the adjective 
diminishes and there are poems with no adjectives (see [3], pp. 9, 12, 20); 

- on average, there is one verb in a predicative mood in more than two lines, i.e. 
the role of the verbal predicate decreases and there are poems with no verbal predicates 
(see [3], p. 20); 

(From classicism to modernism both adjectives and verbal predicates gradually 
but constantly regressed). 

- the poetry of the young poets is characterized by economy of words and, 
implicitly, by the avoidance of the overused words; the adjectives were favored by the 
romantics and the young poets feel the necessity to “renew” poetry; 

- this renewal and the effort to avoid the trivial may also be helped by the elimination of 
adjectives. The restricted use of adjectives and verbal predicates is also accounted for by the 
characteristics of the two main literary currents of our century: 

a) Hermetism, which appeared after World War I, consists mainly in the 
hyper-intellectualization of language and its codification; an adjective (i.e. an explanation 
concerning an object) or the predicative mood of a verb (a strict definition of the 
grammatical tense) may diminish the degree of ambiguity, generalization or abstraction 
intended by the poet. 

b) Surrealism, an avant-garde literary current, aimed at detecting the irrational, the unconscious, 
the dream; because of its precise, definite character, the adjective makes the reader 
“plunge” into the so carefully avoided real world. 

CONJECTURE 3. In the recent Romanian poetry the percentage of “full” words is 
over 55% of the total words. 

Unlike in the spoken language, in which the percentages of “full” and “empty” 
words are equal (see [1]), in poetry the percentage of “full” words is greater. This is due to 
the fact that poetry is essence; it is dense, concentrated. The percentage of “full” words 
and the “density” of a literary work are directly proportional. 

As a conclusion to the three conjectures we may say that: 

- in its evolution from classicism to modernism the percentage of nouns increased, 
while that of verbs decreased; fewer adverbs are used, in turn, because of the 
smaller number of verbs. In all, however, the percentage of “full” words increased. 

3. The frequency of the nouns with and without an article. 



1. Percentage of nouns with an article - 47.884% 

2. Percentage of nouns without an article - 52.116% 



CONJECTURE 4. In the recent Romanian poetry the number of nouns with an 
article is, on average, smaller than the number of those without an article. With an 
article the noun is more definite, more specified, which are characteristics undesirable from the 
same viewpoint as that mentioned above. That is why the indefinite article is favored in 
modern poetry. This preferred indefinite character of the noun consequently 
increases the abstraction, generalization, ambiguity and, hence, the “density” of the poem. 
(See also the second part of assertions 1 and 2 and the statistical conjecture 3.) In its 
evolution from classicism to modernism the number of nouns without an article used in 
poetry also increased. 

4. The frequency of nouns depending on the grammatical case they belong to. 



Case          Percentage    Classification 
Nominative    29.497%       2 
Genitive      19.888%       3 
Dative         0.335%       4 
Accusative    50.056%       1 
Vocative       0.224%       5 



CONJECTURE 5. In the poems under study, over 75% of the nouns are 
accusative or nominative. 



5. Sentences, lines, words, syllables, letters - average relationships 

a)   2.402   letters/syllable 
b)   1.933   syllables/word 
c)   4.643   letters/word 
d)   3.528   words/line 
e)   6.820   syllables/line 
f)  16.380   letters/line 
g)   2.760   lines/sentence 
h)   9.737   words/sentence 
i)  18.823   syllables/sentence 
j)  45.208   letters/sentence 
k)   5.887   sentences/poem 
l)  16.250   lines/poem 
m)  57.330   words/poem 
n) 110.825   syllables/poem 
o) 266.175   letters/poem 



Conclusion: the poems are of medium length; the lines are short while the 
sentences are, again, of medium length. 
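
The averages listed above are linked by simple products (for instance, letters/word = letters/syllable × syllables/word), which allows a quick consistency check. The following is only an illustrative sketch: the dictionary `avg`, the helper `chain` and the tolerance `TOL` are hypothetical names and choices introduced here, not part of the original study.

```python
# Averages as listed above (rounded to three decimals in the source).
avg = {
    "letters/syllable": 2.402,
    "syllables/word": 1.933,
    "letters/word": 4.643,
    "words/line": 3.528,
    "syllables/line": 6.820,
    "letters/line": 16.380,
    "lines/sentence": 2.760,
    "words/sentence": 9.737,
}

def chain(a: str, b: str) -> float:
    """Multiply two ratios, e.g. (letters/syllable) * (syllables/word)."""
    return avg[a] * avg[b]

# Each entry: (derived ratio, first factor, second factor).
checks = [
    ("letters/word", "letters/syllable", "syllables/word"),
    ("syllables/line", "syllables/word", "words/line"),
    ("letters/line", "letters/word", "words/line"),
    ("words/sentence", "words/line", "lines/sentence"),
]

TOL = 0.005  # assumed tolerance, meant to absorb three-decimal rounding
results = {name: abs(chain(a, b) - avg[name]) < TOL for name, a, b in checks}
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

Under the assumed tolerance, the four derived ratios agree with the products of their factors.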



6. The frequency of words according to their length (in syllables) 



Syllables    Percentage    Order 
1            41.509%       1 
2            32.069%       2 
3            19.363%       3 
4             5.688%       4 
5             1.371%       5 
6             0.000%       6 



The total number of syllables in the volume is ... 4,800. The frequency of words and 
their length (in syllables) are in inverse ratio. Long words seem “less poetical”. 

CONJECTURE 6. In the recent Romanian poetry the percentage of words of one 
and two syllables is ... 75%. Again, it seems that short and very short words (of one and 
two syllables) appear more adequate to satisfy the internal rhythm of the poem. Longer 
words already have their own rhythm dictated by the juxtaposition of the syllables; it is 
very probable that this rhythm comes into ... with the rhythm imposed by the poem. 
Shorter words are more easily uttered; longer words seem to render the text more 
difficult. 
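
The syllable-length distribution above also determines the average number of syllables per word and the combined share of one- and two-syllable words. The sketch below is a minimal illustration; the variable names are hypothetical, and the shares are simply the percentages of item 6 written as fractions.

```python
# Share of words by length in syllables, from the table in item 6.
share = {1: 0.41509, 2: 0.32069, 3: 0.19363, 4: 0.05688, 5: 0.01371, 6: 0.0}

total = sum(share.values())                           # shares should add up to 1
mean_syllables = sum(k * p for k, p in share.items()) # expected word length
short_words = share[1] + share[2]                     # one- and two-syllable words

print(f"total share       = {total:.5f}")
print(f"mean syll./word   = {mean_syllables:.3f}")
print(f"1-2 syllable share = {short_words:.3%}")
```

The mean obtained this way rounds to 1.933 syllables per word, in agreement with the average given in item 5, and the one- and two-syllable words together account for about 73.6% of the words.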

7. The frequency of words according to their length (in letters) 

Letters    Percentage    Order 
 1           3.604%       8 
 2          25.426%       1 
 3           8.475%       6 
 4          11.089%       5 
 5          13.347%       3 
 6          13.149%       4 
 7          13.703%       2 
 8           5.861%       7 
 9           3.129%       9 
10           1.149%      10 
11           0.752%      11 
12           0.237%      12 
13           0.079%      13 
14           0.000%      14 



In the whole volume there are only two words of 13 letters and six of 12 letters. About 90% of the 
words consist of no more than 7 letters. 

CONJECTURE 7. In the recent Romanian poetry the percentage of two-letter 
words is, on average, about 25% of the words. In fact, the same percentage, or even a 
higher one, is found in the ordinary language. For aesthetic reasons there is in poetry a 
slight tendency to reduce the frequency of the two-letter words, which are mainly 
prepositions and conjunctions. 

8. The frequency of the letters: 



Order    Letter    Average % frequency of the letter 
 1       E         11.994% 
 2       I         10.166% 
 3       A          8.406% 
 4       R          7.680% 
 5       N          6.407% 
 6       U          6.347% 
 7       T          5.792% 
 8       L          5.237% 
 9       C          5.143% 
10       S          4.220% 
11       O          3.699% 
12       P          3.451% 
13       Ă          3.417% 
14       M          3.178% 
15       D          2.981% 
16       Î          2.828% 
17       V          1.435% 
18       G          1.418% 
19       B          1.358% 
20       Ș          1.281% 
21       F          1.179% 
22       Z          0.846% 
23       Ț          0.803% 
24       H          0.496% 
25       J          0.196% 
26       X          0.034% 
27       Â          0.008% 
28-31    K          0.000% 
28-31    Q          0.000% 
28-31    Y          0.000% 
28-31    W          0.000% 

The average % of vowels: 46.865% 
The average % of consonants: 53.135% 







CONJECTURE 8. In the recent Romanian poetry the percentage of vowels is, on 
average, over 45% of the total of letters. 

Explanation: in the ordinary language the percentage of vowels is 42.7% (see [1]). 
In poetry it is greater because: 

- vowels are more “musical” than consonants; therefore the words with more 
vowels “seem” more poetical; words with many vowels confer a special sonority to the 
text; 

- modern poets and poetry are more preoccupied with form than with content, so that 
more attention is given to expression; the form may prejudice the content because, very 
often, the reader is “caught” by sonority and less by essence; 

- the internal rhythm of poetry, usually absent in the ordinary language, is also 
conditioned, partially, by a greater number of vowels; 

- rhyme, when used, also favors a greater percentage of vowels. The percentage of 
vowels was greater in the period of classicism of poetry when the rhythm and rhyme 
were more frequently used. The special requirements of poetry impose a thorough 
filtration of the ordinary language. 
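
The vowel percentage discussed in Conjecture 8 is a count over letters (not sounds). A minimal sketch of such a count is given below, under stated assumptions: the vowel set `VOWELS` (a, e, i, o, u plus the diacritic forms ă, â, î), the function name `vowel_share` and the sample string are all hypothetical choices made for illustration, and the sample is not taken from [3].

```python
# Romanian vowel letters, including diacritic forms (assumed set).
VOWELS = set("aeiouăâî")

def vowel_share(text: str) -> float:
    """Return the fraction of alphabetic characters that are vowel letters."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0  # avoid division by zero on empty input
    return sum(ch in VOWELS for ch in letters) / len(letters)

sample = "Poezia este esență"  # hypothetical sample phrase, not from [3]
print(f"vowel share = {vowel_share(sample):.4f}")
```

Applied to a whole volume, the same count would give the figure that Conjecture 8 compares with the 42.7% of the ordinary language.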

Given the frequency of the letters in the Romanian language [1] in general: 



 1. E     5. N     9. L    13. D    17. Ș    21. F    25. J 
 2. I     6. T    10. S    14. P    18. B    22. Ț    26. X 
 3. A     7. U    11. O    15. M    19. V    23. Z    27. K 
 4. R     8. C    12. Ă    16. Î    20. G    24. H 





we may calculate the deviation of this volume of verses from the ordinary language: 

$$\alpha(v) = \frac{1}{27}\sum_{i=1}^{27} |\alpha(A_i)| \approx 0.741,$$ 

where $\alpha(A_i)$ is the deviation of the letter $A_i$, $1 \le i \le 27$. 
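
The deviation above can be read as the average absolute difference between the rank of each letter in the volume and its rank in the language in general. The sketch below follows that reading and rests on several assumptions that are not stated in the text: position 7 of the general list is read as U, the letter Â is left out, K is placed 27th in the volume order, and the function name `mean_rank_deviation` is hypothetical.

```python
# Rank order in the volume (item 8), with Â omitted and K placed 27th (assumption).
VOLUME = "E I A R N U T L C S O P Ă M D Î V G B Ș F Z Ț H J X K".split()
# Rank order in the language in general, as listed above from [1]
# (position 7 read as U, an assumption).
GENERAL = "E I A R N T U C L S O Ă D P M Î Ș B V G F Ț Z H J X K".split()

def mean_rank_deviation(order_a, order_b):
    """Average absolute difference between the ranks of each letter."""
    rank_b = {letter: r for r, letter in enumerate(order_b, start=1)}
    diffs = [abs(r - rank_b[letter]) for r, letter in enumerate(order_a, start=1)]
    return sum(diffs) / len(diffs)

alpha_v = mean_rank_deviation(VOLUME, GENERAL)
print(f"alpha(v) = {alpha_v:.3f}")
```

Under these assumptions the rank differences add up to 20 over 27 letters, which is consistent with the value of about 0.741 quoted above.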

The informational energy, according to O. Onicescu, is 

$$E(v) = \sum_{i=1}^{27} p_i^2 \approx 0.064,$$ 

where $p_i$, $1 \le i \le 27$, is the probability that the letter $A_i$ may appear in the volume (see [1]). 

The first order entropy of the volume (according to Shannon) is: 

$$H_1(v) = -\frac{1}{\log_{10} 2}\sum_{i=1}^{27} p_i \log_{10} p_i \approx 4.222.$$ 
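
Both measures above can be recomputed from the letter percentages of item 8. The following is a sketch under stated assumptions, not the author's procedure: the names `FREQ`, `informational_energy` and `shannon_entropy` are hypothetical, the letters at ranks 13, 16, 20, 23 and 27 are read as Ă, Î, Ș, Ț and Â, and the percentage of G is taken as 1.418 so that the consonants add up to the stated 53.135%. The entropy is computed with base-2 logarithms, which is equivalent to dividing the base-10 sum by log10 2.

```python
import math

# Letter frequencies in percent, from the table in item 8.
# Assumption: G is taken as 1.418 so that consonants total 53.135%.
FREQ = {
    "E": 11.994, "I": 10.166, "A": 8.406, "R": 7.680, "N": 6.407,
    "U": 6.347, "T": 5.792, "L": 5.237, "C": 5.143, "S": 4.220,
    "O": 3.699, "P": 3.451, "Ă": 3.417, "M": 3.178, "D": 2.981,
    "Î": 2.828, "V": 1.435, "G": 1.418, "B": 1.358, "Ș": 1.281,
    "F": 1.179, "Z": 0.846, "Ț": 0.803, "H": 0.496, "J": 0.196,
    "X": 0.034, "Â": 0.008,
}

p = [f / 100.0 for f in FREQ.values()]  # percentages as probabilities

def informational_energy(probs):
    """Onicescu informational energy: sum of squared probabilities."""
    return sum(q * q for q in probs)

def shannon_entropy(probs):
    """First-order Shannon entropy in bits; zero probabilities are skipped."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

print(f"E(v)  = {informational_energy(p):.3f}")
print(f"H1(v) = {shannon_entropy(p):.3f}")
```

Under these assumptions the results come out close to the values 0.064 and 4.222 given above.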

9. The themes of the volume are studied by determining the recurrent elements, 
those that seem to obsess the poet. We will call these elements “key-words” and they are, 
in order: nouns, verbs, adjectives. Their frequency in the volume is studied. The more 
frequent words are all included in common notional spheres that will “decode” the 
themes dealt with by the poet in the volume under study, i.e.: 

These 33 key-words (together with their synonyms) confer a certain pastoral note 
(this was noticed by Constantin Matei in the newspaper “Inainte”, Craiova), cosmological 
(Constantin M. Popa) and existentialist nuances (Aureliu Goci, “Luceafarul”, Bucharest); the 
poet's preoccupation with the condition of the poet and of society (Ion Pachia 
Tatomirescu, Craiova) is also revealed by the frequent use of certain suggestive words. 

Of all the words, the 33 key-words together with their synonyms have the greatest 
frequency in the volume. 

10. The frequency of words and phrases that strongly deviate from the “normal”, i.e. 
from the rules of the literary language, is about 1.980% of the total words. (We mean 
expressions like “state of self”, “very near myself”, “it is raining at plus infinite”, or 
words like “nontime”, etc.; see [3], pp. 9, 29, 40, 31.) 

CONJECTURE 9. In the recent Romanian poetry the percentage of words and 
phrases that strongly deviate from the “normal” of the ordinary language, as well as from the 
rules of the literary language, is slightly over 1%. This fact may be accounted for by: 

- content seems less important; poets are more concerned with form; 

- poets invent words and expressions to be able to better reveal their feelings 
and emotions; 

- the association of antonyms may give birth to constructions that somehow 
“violate” the normal; 

- poetry is, in fact, destined to break the rules and rebel against the ordinary fact 
(if this right is denied, any newspaper article could be called poetry). 

“In art” said Voltaire, “rules are only meant to be broken”. 

In its evolution from classicism to modernism the percentage of such abnormal 
words and constructions increased, starting, in fact, from zero. Modern literary currents 
favor their appearance. 